I’m often asked why I’m so obsessed with backward compatibility and, as a result, why I’ve made the issue such a central part of the LSB over the past year. Yes, it’s hard, particularly in the Linux world, because there are thousands of developers building the components that make up the platform, and it just takes one to break compatibility and make our lives difficult. Even worse, the idea of keeping extraneous stuff around for the long term “just” for the sake of compatibility is anathema to most engineers. Elegance of design is a much higher calling than the pedestrian task of making sure things don’t break.
Why is backward compatibility important? Here’s a great example, via Joel Spolsky (note: from 2004):
Raymond Chen is a developer on the Windows team at Microsoft. He’s been there since 1992, and his weblog The Old New Thing is chock-full of detailed technical stories about why certain things are the way they are in Windows, even silly things, which turn out to have very good reasons.
The most impressive things to read on Raymond’s weblog are the stories of the incredible efforts the Windows team has made over the years to support backwards compatibility: “Look at the scenario from the customer’s standpoint. You bought programs X, Y and Z. You then upgraded to Windows XP. Your computer now crashes randomly, and program Z doesn’t work at all. You’re going to tell your friends, ‘Don’t upgrade to Windows XP. It crashes randomly, and it’s not compatible with program Z.’ Are you going to debug your system to determine that program X is causing the crashes, and that program Z doesn’t work because it is using undocumented window messages? Of course not. You’re going to return the Windows XP box for a refund. (You bought programs X, Y, and Z some months ago. The 30-day return policy no longer applies to them. The only thing you can return is Windows XP.)”
I first heard about this from one of the developers of the hit game SimCity, who told me that there was a critical bug in his application: it used memory right after freeing it, a major no-no that happened to work OK on DOS but would not work under Windows where memory that is freed is likely to be snatched up by another running application right away. The testers on the Windows team were going through various popular applications, testing them to make sure they worked OK, but SimCity kept crashing. They reported this to the Windows developers, who disassembled SimCity, stepped through it in a debugger, found the bug, and added special code that checked if SimCity was running, and if it did, ran the memory allocator in a special mode in which you could still use memory after freeing it.
This was not an unusual case. The Windows testing team is huge and one of their most important responsibilities is guaranteeing that everyone can safely upgrade their operating system, no matter what applications they have installed, and those applications will continue to run, even if those applications do bad things or use undocumented functions or rely on buggy behavior that happens to be buggy in Windows n but is no longer buggy in Windows n+1…
A lot of developers and engineers don’t agree with this way of working. If the application did something bad, or relied on some undocumented behavior, they think, it should just break when the OS gets upgraded. The developers of the Macintosh OS at Apple have always been in this camp. It’s why so few applications from the early days of the Macintosh still work…
To contrast, I’ve got DOS applications that I wrote in 1983 for the very original IBM PC that still run flawlessly, thanks to the Raymond Chen Camp at Microsoft.
I can almost feel the revulsion among my readership right about now. However, next time you’re in Best Buy or CompUSA, look at the shelf of Windows applications, then compare it to the shelf of Mac applications, and perhaps you’ll better understand why it’s important.
Beyond the results speaking for themselves, I’ll argue that it takes a better engineer to move a platform forward while at the same time making sure things don’t break. It’s pretty easy to wash your hands of something and declare it to be someone else’s problem.